Similarity based clustering using the expectation maximization algorithm
Abstract
In this paper we present a new approach for clustering data. The clustering metric used is the normalized cross-correlation, also known as similarity, instead of the traditionally used Euclidean distance. The main advantage of this metric is that it depends on the signal shape rather than its amplitude. Under the assumption of an exponential probability model that has several desirable properties, the expectation-maximization (EM) framework is used to derive two iterative clustering algorithms. Numerical experiments are presented using simulated data from a dynamic positron emission tomography study of the brain. Initial results demonstrate that the proposed method outperforms several existing clustering methods.
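To make the shape-versus-amplitude point concrete, here is a minimal Python sketch. It assumes the zero-lag, non-mean-subtracted form of normalized cross-correlation, that is, the inner product divided by the product of the norms; the abstract does not give the exact formula, so this definition is an assumption. The function names and sample curves are hypothetical and are not taken from the paper.

```python
import math


def normalized_cross_correlation(x, y):
    """Zero-lag normalized cross-correlation: <x, y> / (||x|| * ||y||).

    Assumed definition; the abstract does not state the formula.
    """
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("similarity is undefined for an all-zero signal")
    return dot / (norm_x * norm_y)


def euclidean_distance(x, y):
    """Ordinary Euclidean distance between two equal-length signals."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical signals: a curve, the same curve scaled by 5,
# and a differently shaped curve of comparable amplitude.
curve = [0.0, 1.0, 3.0, 2.0, 1.0]
scaled = [5.0 * v for v in curve]
other = [3.0, 2.0, 1.0, 0.5, 0.0]

sim_scaled = normalized_cross_correlation(curve, scaled)
sim_other = normalized_cross_correlation(curve, other)
dist_scaled = euclidean_distance(curve, scaled)
dist_other = euclidean_distance(curve, other)

print(f"similarity: scaled copy {sim_scaled:.3f}, other shape {sim_other:.3f}")
print(f"distance:   scaled copy {dist_scaled:.3f}, other shape {dist_other:.3f}")
```

In this sketch the scaled copy has a similarity of essentially 1 with the original curve even though its Euclidean distance is larger than that of the differently shaped curve, so the two metrics rank the candidates in opposite order.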
Similar resources
Clustering-based Behavioural Analysis of Biological Objects
The article examines the problem of processing short time series for bioinformatics tasks using data mining methods in the field of pharmacology. The experiments were conducted using heart contraction (contraction and relaxation) power data that were obtained in experiments with laboratory animals with the goal of registering the power changes of heart contractions in different stages of experi...
A Complex Networks Approach for Data Clustering
Many methods have been developed for data clustering, such as k-means, expectation maximization and algorithms based on graph theory. In this latter case, graphs are generally constructed by taking into account the Euclidean distance as a similarity measure, and partitioned using spectral methods. However, these methods are not accurate when the clusters are not well separated. In addition, it ...
Pre Processing Techniques for Arabic Documents Clustering
Clustering of text documents is an important technique for document retrieval. It aims to organize documents into meaningful groups or clusters. Text preprocessing plays a major role in improving the clustering of Arabic documents. This research examines and compares text preprocessing techniques in Arabic document clustering. It also studies the effectiveness of text preprocessing techniques: ...
Co-Clustering the Documents and Words Using-IJCSEC
In this paper, we propose a novel constrained coclustering method to achieve two goals. First, we combine information theoretic coclustering and constrained clustering to improve clustering performance. Second, we adopt both supervised and unsupervised constraints to demonstrate the effectiveness of our algorithm. The unsupervised constraints are automatically derived from existing knowledge so...
Expectation Maximization for Clustering on Hyperspheres
High-dimensional directional data is becoming increasingly important in contemporary applications such as analysis of text and gene-expression data. A natural model for multivariate directional data is provided by the von Mises-Fisher (vMF) distribution on the unit hypersphere, which is analogous to the multivariate Gaussian distribution in R^d. In this paper, we propose modeling complex directional ...